Abstract: Data mining work usually focuses on how to apply mining techniques to data. However, data preparation, such as cleaning observations, handling missing values, and transforming variables, is just as important. The pre-processing and transformation steps in knowledge discovery in databases (KDD) strongly affect model performance. In this paper, we compare the performance of several classification methods across different data sets. The data sets are generated from real bank data and differ in which variables are transformed. The results show that variable transformation improves model performance. Moreover, the more of the appropriate variables are transformed, the greater the improvement.
Keywords: Classification, variable transformation, customer marketing, performance comparison, neural networks